Overview

Dataset statistics

Number of variables: 9
Number of observations: 36733
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 2.5 MiB
Average record size in memory: 72.0 B
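
The size and duplicate figures above can be sanity-checked with a little arithmetic. The sketch below assumes every one of the 9 numeric columns is stored as an 8-byte float64; the report does not state the dtype, so treat that as an assumption rather than a fact about the dataset.

```python
# Assumption: all 9 numeric columns are float64 (8 bytes per value).
N_ROWS = 36733
N_COLS = 9
BYTES_PER_FLOAT64 = 8
N_DUPLICATE_ROWS = 2

# One record holds one value per column.
record_bytes = N_COLS * BYTES_PER_FLOAT64

# Total size of the values in MiB (1 MiB = 2**20 bytes).
total_mib = N_ROWS * record_bytes / 2**20

# Share of duplicate rows, in percent.
duplicate_pct = 100 * N_DUPLICATE_ROWS / N_ROWS

print(record_bytes, round(total_mib, 1), duplicate_pct < 0.1)
```

Under that assumption the record size comes out at 72 bytes and the total rounds to 2.5 MiB, which agrees with the reported values.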

Variable types

Numeric: 9

Alerts

Dataset has 2 (< 0.1%) duplicate rows (Duplicates)
AFDP is highly correlated with TIT and 3 other fields (High correlation)
TIT is highly correlated with AFDP and 3 other fields (High correlation)
GTEP is highly correlated with AFDP and 3 other fields (High correlation)
CDP is highly correlated with AFDP and 3 other fields (High correlation)
TEY is highly correlated with AFDP and 3 other fields (High correlation)
AFDP is highly correlated with TIT and 3 other fields (High correlation)
TIT is highly correlated with AFDP and 3 other fields (High correlation)
GTEP is highly correlated with AFDP and 4 other fields (High correlation)
CDP is highly correlated with AFDP and 4 other fields (High correlation)
TAT is highly correlated with GTEP and 2 other fields (High correlation)
TEY is highly correlated with AFDP and 4 other fields (High correlation)
AFDP is highly correlated with TIT and 2 other fields (High correlation)
TIT is highly correlated with AFDP and 3 other fields (High correlation)
GTEP is highly correlated with AFDP and 3 other fields (High correlation)
CDP is highly correlated with AFDP and 3 other fields (High correlation)
TEY is highly correlated with TIT and 2 other fields (High correlation)
AH is highly correlated with AT (High correlation)
AT is highly correlated with AH and 5 other fields (High correlation)
AFDP is highly correlated with TIT and 4 other fields (High correlation)
TIT is highly correlated with AFDP and 4 other fields (High correlation)
GTEP is highly correlated with AT and 5 other fields (High correlation)
AP is highly correlated with AT (High correlation)
CDP is highly correlated with AT and 5 other fields (High correlation)
TAT is highly correlated with AT and 5 other fields (High correlation)
TEY is highly correlated with AT and 5 other fields (High correlation)

Reproduction

Analysis started: 2022-03-04 21:42:27.545200
Analysis finished: 2022-03-04 21:42:39.048605
Duration: 11.5 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

AH
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 25708
Distinct (%): 70.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.86701549
Minimum: 24.085
Maximum: 100.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 287.1 KiB

Quantile statistics

Minimum: 24.085
5-th percentile: 50.7772
Q1: 68.188
median: 80.47
Q3: 89.376
95-th percentile: 97.3934
Maximum: 100.2
Range: 76.115
Interquartile range (IQR): 21.188

Descriptive statistics

Standard deviation: 14.46135495
Coefficient of variation (CV): 0.1857186237
Kurtosis: -0.2745902897
Mean: 77.86701549
Median Absolute Deviation (MAD): 10.164
Skewness: -0.6280340401
Sum: 2860289.08
Variance: 209.1307869
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                  Count   Frequency (%)
100.12                 46      0.1%
100.14                 46      0.1%
100.16                 42      0.1%
100.15                 42      0.1%
100.11                 38      0.1%
100.09                 34      0.1%
100.13                 33      0.1%
100.1                  27      0.1%
100.17                 27      0.1%
100.06                 25      0.1%
Other values (25698)   36373   99.0%

Smallest values

Value    Count   Frequency (%)
24.085   1       < 0.1%
24.666   1       < 0.1%
25.987   1       < 0.1%
26.615   1       < 0.1%
27.504   1       < 0.1%
29.27    1       < 0.1%
29.316   1       < 0.1%
29.434   1       < 0.1%
29.475   1       < 0.1%
29.551   1       < 0.1%

Largest values

Value    Count   Frequency (%)
100.2    4       < 0.1%
100.19   1       < 0.1%
100.18   5       < 0.1%
100.17   27      0.1%
100.16   42      0.1%
100.15   42      0.1%
100.14   46      0.1%
100.13   33      0.1%
100.12   46      0.1%
100.11   38      0.1%
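
Several of the AH statistics reported above are derived from others, which allows a quick consistency check. The following is a minimal sketch using only the figures printed in this report; the variable names are illustrative and not part of pandas-profiling.

```python
import math

# Figures copied from the AH section of this report.
mean = 77.86701549
std = 14.46135495
q1, q3 = 68.188, 89.376
minimum, maximum = 24.085, 100.2

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Interquartile range: spread of the middle 50% of the data.
iqr = q3 - q1

# Range: distance between the extremes.
value_range = maximum - minimum

# Variance is the square of the standard deviation.
variance = std ** 2

print(cv, iqr, value_range, variance)
```

The computed values agree with the reported CV, IQR, range and variance up to rounding. The same check applies to the other eight variables.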

AT
Real number (ℝ)

HIGH CORRELATION

Distinct: 22523
Distinct (%): 61.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.71272625
Minimum: -6.2348
Maximum: 37.103
Zeros: 0
Zeros (%): 0.0%
Negative: 62
Negative (%): 0.2%
Memory size: 287.1 KiB

Quantile statistics

Minimum: -6.2348
5-th percentile: 5.75584
Q1: 11.781
median: 17.801
Q3: 23.665
95-th percentile: 29.4848
Maximum: 37.103
Range: 43.3378
Interquartile range (IQR): 11.884

Descriptive statistics

Standard deviation: 7.447451235
Coefficient of variation (CV): 0.4204576488
Kurtosis: -0.8265999421
Mean: 17.71272625
Median Absolute Deviation (MAD): 5.945
Skewness: -0.04354672221
Sum: 650641.5735
Variance: 55.46452989
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                  Count   Frequency (%)
18.525                 8       < 0.1%
23.969                 8       < 0.1%
11.92                  8       < 0.1%
10.925                 7       < 0.1%
25.597                 7       < 0.1%
20.752                 7       < 0.1%
20.72                  7       < 0.1%
18.431                 7       < 0.1%
12.603                 7       < 0.1%
16.792                 7       < 0.1%
Other values (22513)   36660   99.8%

Smallest values

Value     Count   Frequency (%)
-6.2348   1       < 0.1%
-6.0421   1       < 0.1%
-5.9793   1       < 0.1%
-5.9031   1       < 0.1%
-5.8956   1       < 0.1%
-5.8847   1       < 0.1%
-5.82     1       < 0.1%
-5.8189   1       < 0.1%
-5.785    1       < 0.1%
-5.7711   1       < 0.1%

Largest values

Value    Count   Frequency (%)
37.103   1       < 0.1%
37.098   1       < 0.1%
36.264   1       < 0.1%
35.822   1       < 0.1%
35.461   1       < 0.1%
35.406   1       < 0.1%
35.395   1       < 0.1%
35.21    1       < 0.1%
35.161   1       < 0.1%
35.045   1       < 0.1%

AFDP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 20495
Distinct (%): 55.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.925517714
Minimum: 2.0874
Maximum: 7.6106
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 287.1 KiB

Quantile statistics

Minimum: 2.0874
5-th percentile: 2.6666
Q1: 3.3556
median: 3.9377
Q3: 4.3769
95-th percentile: 5.31142
Maximum: 7.6106
Range: 5.5232
Interquartile range (IQR): 1.0213

Descriptive statistics

Standard deviation: 0.7739355929
Coefficient of variation (CV): 0.1971550377
Kurtosis: 0.2246259001
Mean: 3.925517714
Median Absolute Deviation (MAD): 0.4949
Skewness: 0.381096574
Sum: 144196.0422
Variance: 0.598976302
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                  Count   Frequency (%)
4.0076                 9       < 0.1%
3.2115                 9       < 0.1%
4.176                  8       < 0.1%
4.1601                 8       < 0.1%
3.7043                 8       < 0.1%
3.7056                 8       < 0.1%
4.1083                 8       < 0.1%
4.25                   8       < 0.1%
3.5297                 8       < 0.1%
3.8733                 8       < 0.1%
Other values (20485)   36651   99.8%

Smallest values

Value    Count   Frequency (%)
2.0874   1       < 0.1%
2.0992   1       < 0.1%
2.1057   1       < 0.1%
2.1197   1       < 0.1%
2.1395   1       < 0.1%
2.1441   1       < 0.1%
2.1517   1       < 0.1%
2.1597   1       < 0.1%
2.1673   1       < 0.1%
2.185    1       < 0.1%

Largest values

Value    Count   Frequency (%)
7.6106   1       < 0.1%
7.5549   1       < 0.1%
7.3189   1       < 0.1%
7.2399   1       < 0.1%
6.9831   1       < 0.1%
6.9779   1       < 0.1%
6.956    1       < 0.1%
6.9312   1       < 0.1%
6.927    1       < 0.1%
6.9259   1       < 0.1%

TIT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 799
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1081.428084
Minimum: 1000.8
Maximum: 1100.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 287.1 KiB

Quantile statistics

Minimum: 1000.8
5-th percentile: 1047.8
Q1: 1071.8
median: 1085.9
Q3: 1097
95-th percentile: 1100.1
Maximum: 1100.9
Range: 100.1
Interquartile range (IQR): 25.2

Descriptive statistics

Standard deviation: 17.53637294
Coefficient of variation (CV): 0.01621594001
Kurtosis: -0.0457552994
Mean: 1081.428084
Median Absolute Deviation (MAD): 12.9
Skewness: -0.8882780436
Sum: 39724097.8
Variance: 307.5243757
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                Count   Frequency (%)
1100                 2735    7.4%
1099.9               2158    5.9%
1100.1               1322    3.6%
1099.8               870     2.4%
1100.2               527     1.4%
1099.7               324     0.9%
1100.3               260     0.7%
1099.6               186     0.5%
1085.4               143     0.4%
1086.5               137     0.4%
Other values (789)   28071   76.4%

Smallest values

Value    Count   Frequency (%)
1000.8   1       < 0.1%
1001.3   1       < 0.1%
1001.4   2       < 0.1%
1002.9   1       < 0.1%
1006.5   1       < 0.1%
1007.9   1       < 0.1%
1009     1       < 0.1%
1009.5   1       < 0.1%
1011.4   1       < 0.1%
1011.7   1       < 0.1%

Largest values

Value    Count   Frequency (%)
1100.9   1       < 0.1%
1100.8   1       < 0.1%
1100.7   1       < 0.1%
1100.6   3       < 0.1%
1100.5   15      < 0.1%
1100.4   87      0.2%
1100.3   260     0.7%
1100.2   527     1.4%
1100.1   1322    3.6%
1100     2735    7.4%

GTEP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12967
Distinct (%): 35.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.56380138
Minimum: 17.698
Maximum: 40.716
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 287.1 KiB

Quantile statistics

Minimum: 17.698
5-th percentile: 19.251
Q1: 23.129
median: 25.104
Q3: 29.061
95-th percentile: 32.9
Maximum: 40.716
Range: 23.018
Interquartile range (IQR): 5.932

Descriptive statistics

Standard deviation: 4.195957462
Coefficient of variation (CV): 0.1641366791
Kurtosis: -0.6538527404
Mean: 25.56380138
Median Absolute Deviation (MAD): 2.488
Skewness: 0.3290213527
Sum: 939035.116
Variance: 17.60605903
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                  Count   Frequency (%)
24.308                 18      < 0.1%
24.672                 16      < 0.1%
25.463                 16      < 0.1%
25.184                 15      < 0.1%
25.106                 15      < 0.1%
25.487                 15      < 0.1%
25.557                 14      < 0.1%
25.352                 14      < 0.1%
25.443                 14      < 0.1%
25.299                 14      < 0.1%
Other values (12957)   36582   99.6%

Smallest values

Value    Count   Frequency (%)
17.698   1       < 0.1%
17.719   1       < 0.1%
17.738   1       < 0.1%
17.741   1       < 0.1%
17.761   1       < 0.1%
17.826   1       < 0.1%
17.857   2       < 0.1%
17.862   1       < 0.1%
17.878   2       < 0.1%
17.912   1       < 0.1%

Largest values

Value    Count   Frequency (%)
40.716   1       < 0.1%
40.106   1       < 0.1%
39.37    1       < 0.1%
38.922   1       < 0.1%
38.362   1       < 0.1%
38.171   1       < 0.1%
38.051   1       < 0.1%
37.877   1       < 0.1%
37.873   1       < 0.1%
37.864   1       < 0.1%

AP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 791
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1013.070165
Minimum: 985.85
Maximum: 1036.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 287.1 KiB

Quantile statistics

Minimum: 985.85
5-th percentile: 1003.3
Q1: 1008.8
median: 1012.6
Q3: 1017
95-th percentile: 1024.3
Maximum: 1036.6
Range: 50.75
Interquartile range (IQR): 8.2

Descriptive statistics

Standard deviation: 6.463345955
Coefficient of variation (CV): 0.00637995884
Kurtosis: 0.4419933185
Mean: 1013.070165
Median Absolute Deviation (MAD): 4.1
Skewness: 0.194121007
Sum: 37213106.37
Variance: 41.77484093
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                Count   Frequency (%)
1012.1               297     0.8%
1010.8               288     0.8%
1011.8               284     0.8%
1011.9               284     0.8%
1011.1               283     0.8%
1012.2               281     0.8%
1010.9               279     0.8%
1012                 276     0.8%
1012.6               276     0.8%
1012.7               275     0.7%
Other values (781)   33910   92.3%

Smallest values

Value    Count   Frequency (%)
985.85   1       < 0.1%
986.16   1       < 0.1%
986.25   1       < 0.1%
986.41   2       < 0.1%
986.43   1       < 0.1%
986.56   1       < 0.1%
986.78   1       < 0.1%
986.87   1       < 0.1%
987.31   1       < 0.1%
987.43   1       < 0.1%

Largest values

Value    Count   Frequency (%)
1036.6   1       < 0.1%
1036.5   2       < 0.1%
1036.4   2       < 0.1%
1036.3   4       < 0.1%
1036.2   1       < 0.1%
1036     1       < 0.1%
1035.8   3       < 0.1%
1035.7   2       < 0.1%
1035.6   2       < 0.1%
1035.5   2       < 0.1%

CDP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 4447
Distinct (%): 12.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.06052515
Minimum: 9.8518
Maximum: 15.159
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 287.1 KiB

Quantile statistics

Minimum: 9.8518
5-th percentile: 10.385
Q1: 11.435
median: 11.965
Q3: 12.855
95-th percentile: 13.989
Maximum: 15.159
Range: 5.3072
Interquartile range (IQR): 1.42

Descriptive statistics

Standard deviation: 1.088795301
Coefficient of variation (CV): 0.09027760296
Kurtosis: -0.6315875791
Mean: 12.06052515
Median Absolute Deviation (MAD): 0.637
Skewness: 0.2367915709
Sum: 443019.2702
Variance: 1.185475207
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
11.891                55      0.1%
11.872                47      0.1%
11.908                46      0.1%
11.902                45      0.1%
11.899                44      0.1%
11.901                43      0.1%
11.839                43      0.1%
11.835                43      0.1%
11.916                43      0.1%
11.937                41      0.1%
Other values (4437)   36283   98.8%

Smallest values

Value    Count   Frequency (%)
9.8518   1       < 0.1%
9.8708   1       < 0.1%
9.8754   1       < 0.1%
9.8806   1       < 0.1%
9.9044   1       < 0.1%
9.9046   1       < 0.1%
9.9178   1       < 0.1%
9.9239   1       < 0.1%
9.9244   1       < 0.1%
9.9286   1       < 0.1%

Largest values

Value    Count   Frequency (%)
15.159   1       < 0.1%
15.083   1       < 0.1%
15.081   1       < 0.1%
15.055   1       < 0.1%
15.043   1       < 0.1%
15.042   1       < 0.1%
15.039   1       < 0.1%
15.031   1       < 0.1%
15.029   1       < 0.1%
15.002   1       < 0.1%

TAT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2769
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 546.1585171
Minimum: 511.04
Maximum: 550.61
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 287.1 KiB

Quantile statistics

Minimum: 511.04
5-th percentile: 529.96
Q1: 544.72
median: 549.88
Q3: 550.04
95-th percentile: 550.3
Maximum: 550.61
Range: 39.57
Interquartile range (IQR): 5.32

Descriptive statistics

Standard deviation: 6.842360433
Coefficient of variation (CV): 0.01252815844
Kurtosis: 2.016791705
Mean: 546.1585171
Median Absolute Deviation (MAD): 0.26
Skewness: -1.755907087
Sum: 20062040.81
Variance: 46.8178963
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
550.01                657     1.8%
550                   648     1.8%
549.98                639     1.7%
549.99                628     1.7%
549.96                625     1.7%
550.04                611     1.7%
549.97                607     1.7%
550.03                590     1.6%
550.02                590     1.6%
549.94                584     1.6%
Other values (2759)   30554   83.2%

Smallest values

Value    Count   Frequency (%)
511.04   1       < 0.1%
512.45   1       < 0.1%
512.6    2       < 0.1%
513.06   1       < 0.1%
513.09   1       < 0.1%
513.17   1       < 0.1%
513.29   1       < 0.1%
513.47   1       < 0.1%
513.75   1       < 0.1%
514.3    1       < 0.1%

Largest values

Value    Count   Frequency (%)
550.61   1       < 0.1%
550.6    1       < 0.1%
550.59   1       < 0.1%
550.57   2       < 0.1%
550.56   3       < 0.1%
550.55   4       < 0.1%
550.54   2       < 0.1%
550.53   5       < 0.1%
550.52   8       < 0.1%
550.51   11      < 0.1%

TEY
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6236
Distinct (%): 17.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 133.5064035
Minimum: 100.02
Maximum: 179.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 287.1 KiB

Quantile statistics

Minimum: 100.02
5-th percentile: 109.03
Q1: 124.45
median: 133.73
Q3: 144.08
95-th percentile: 161.33
Maximum: 179.5
Range: 79.48
Interquartile range (IQR): 19.63

Descriptive statistics

Standard deviation: 15.61863437
Coefficient of variation (CV): 0.1169879044
Kurtosis: -0.5001962549
Mean: 133.5064035
Median Absolute Deviation (MAD): 9.76
Skewness: 0.1165547708
Sum: 4904090.72
Variance: 243.9417396
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
133.78                185     0.5%
133.74                174     0.5%
133.76                168     0.5%
133.67                163     0.4%
133.79                149     0.4%
133.72                145     0.4%
133.75                141     0.4%
133.73                140     0.4%
133.77                136     0.4%
133.68                135     0.4%
Other values (6226)   35197   95.8%

Smallest values

Value    Count   Frequency (%)
100.02   1       < 0.1%
100.03   1       < 0.1%
100.04   1       < 0.1%
100.07   1       < 0.1%
100.14   1       < 0.1%
100.17   1       < 0.1%
100.2    2       < 0.1%
100.22   1       < 0.1%
100.32   1       < 0.1%
100.36   1       < 0.1%

Largest values

Value    Count   Frequency (%)
179.5    1       < 0.1%
178.31   1       < 0.1%
177.91   1       < 0.1%
177.88   1       < 0.1%
177.49   1       < 0.1%
176.91   1       < 0.1%
176.71   1       < 0.1%
176.55   1       < 0.1%
176.35   1       < 0.1%
176.25   1       < 0.1%

Interactions

Pairwise interaction plots for the 9 variables (81 panels; images not reproduced here).

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
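
The calculation just described can be sketched in plain Python. This is an illustrative implementation, not the code pandas-profiling runs; the function names are hypothetical, and tied values are given the average of the ranks they span, which is one common convention.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their std devs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    # The 1/n factors of covariance and std devs cancel, so they are omitted.
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / (sx * sy)

# A monotonic but nonlinear relationship (y = x**3) has perfect rank agreement.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```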

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
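
A minimal sketch of this calculation in plain Python follows. The function name is hypothetical, and the code is an illustration of the formula rather than the implementation the report relies on. The last lines illustrate the invariance property: shifting and rescaling one variable by a positive factor leaves r unchanged.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std devs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and std devs cancel, so they are omitted.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Change location (+5) and scale (x10) of x only; r should not move.
r_rescaled = pearson_r([10 * a + 5 for a in x], y)
print(r_original, r_rescaled)
```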

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     AH       AT       AFDP     TIT      GTEP     AP       CDP      TAT      TEY
0    90.262   9.3779   2.3927   1043.6   19.166   1020.1   10.564   541.16   110.16
1    89.934   9.2985   2.3732   1039.9   19.119   1019.9   10.572   538.94   109.23
2    89.868   9.1337   2.3854   1041.0   19.178   1019.8   10.543   539.47   109.62
3    89.490   8.9715   2.3825   1037.1   19.180   1019.3   10.458   536.89   108.88
4    89.099   9.0157   2.4044   1043.5   19.206   1019.1   10.464   541.25   110.09
5    88.783   9.0465   2.3826   1042.5   19.304   1019.0   10.461   540.14   110.23
6    88.853   8.8649   2.4237   1043.5   19.269   1019.0   10.480   540.87   110.53
7    88.760   8.9862   2.4409   1043.7   19.446   1019.2   10.475   540.65   110.61
8    88.624   8.9956   2.3959   1037.3   19.217   1019.5   10.463   536.89   109.03
9    88.561   8.9836   2.4067   1039.7   19.140   1019.6   10.451   538.61   109.28

Last rows

        AH        AT        AFDP     TIT      GTEP     AP       CDP      TAT      TEY
36723   98.388    10.4540   3.5555   1053.4   18.937   1004.5   10.327   550.03   110.78
36724   99.282    10.3050   3.5339   1053.3   18.909   1004.6   10.328   550.00   110.78
36725   99.995    10.2380   3.8805   1067.5   21.206   1004.6   11.002   550.32   121.26
36726   100.170   10.3470   4.3198   1084.3   24.048   1004.9   11.685   549.98   133.74
36727   99.985    10.1550   3.7043   1059.7   19.837   1005.1   10.570   549.90   115.52
36728   98.460    9.0301    3.5421   1049.7   19.164   1005.6   10.400   546.21   111.61
36729   99.093    7.8879    3.5059   1046.3   19.414   1005.9   10.433   543.22   111.78
36730   99.496    7.2647    3.4770   1037.7   19.530   1006.3   10.483   537.32   110.19
36731   99.008    7.0060    3.4486   1043.2   19.377   1006.8   10.533   541.24   110.74
36732   97.533    6.9279    3.4275   1049.9   19.306   1007.2   10.583   545.85   111.58

Duplicate rows

Most frequently occurring

    AH       AT       AFDP     TIT      GTEP     AP       CDP      TAT      TEY      # duplicates
1   95.938   23.156   4.0547   1076.6   24.672   1004.2   11.835   549.87   127.01   5
0   87.328   26.067   5.0703   1099.1   29.984   1008.3   13.038   546.78   146.14   4
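
A table like the one above, listing repeated rows with their occurrence counts, can be produced by grouping identical rows. The sketch below uses only the standard library and toy data; the function name and sample rows are hypothetical, and it stands in for whatever pandas-based logic the profiling tool applies.

```python
from collections import Counter

def duplicate_row_counts(rows):
    """Return (row, count) for rows seen more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    duplicates = [(row, n) for row, n in counts.items() if n > 1]
    # Sort by occurrence count, descending ("most frequently occurring").
    duplicates.sort(key=lambda item: item[1], reverse=True)
    return duplicates

# Toy data for illustration only, not taken from the profiled dataset.
sample_rows = [
    (1.0, 2.0),
    (3.0, 4.0),
    (1.0, 2.0),
    (5.0, 6.0),
    (3.0, 4.0),
    (1.0, 2.0),
]
for row, n in duplicate_row_counts(sample_rows):
    print(row, n)
```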